Primary analysis
The main objective analysis will compare the proportion of pneumococcal carriage between the PCV13 and the saline arms. As stated above in the analysis population section, the main analysis will use data from all participants, whether inoculated with the 20,000, the 80,000 or the 160,000 CFU dose. This comparison will be done using a log-binomial regression model, with predictor variables for inoculation dose group (the CFU 160,000 group will be used as reference as the largest group) and vaccine arm:
\[
\log(\pi) = \log\left(E[Y|X_{20,000},X_{80,000},X_{PCV13}]\right) = \beta_0 + \beta_{20,000}\cdot X_{20,000} + \beta_{80,000}\cdot X_{80,000} + \beta_{PCV13}\cdot X_{PCV13}
\]
where \(\pi\) is the probability of being a carrier, \(Y\) is a random variable indicating serotype 6B carriage status, \(X_{20,000}\) is an indicator variable for whether or not a participant was inoculated at CFU 20,000, similarly for \(X_{80,000}\), and \(X_{PCV13}\) is an indicator variable for vaccination (with PCV13 vaccine) status. \(\beta_0, \beta_{20,000}, \beta_{80,000}, \beta_{PCV13}\) are the regression coefficients that will be simulated. \(Y|X_{20,000},X_{80,000},X_{PCV13}\) follows a binomial distribution with parameter \(\pi\).
For our main analysis of the effectiveness of the PCV13 vaccine, we will test the null hypothesis \(H_0\) of no effect of the PCV13 vaccine against the alternative hypothesis \(H_1\) that it has an effect on the carriage probability:
\[
H_0:\qquad\beta_{PCV13} = 0 \\
H_1:\qquad\beta_{PCV13} \neq 0
\]
We will report the estimated risk ratio from the log-binomial regression model \(exp(\hat{\beta}_{PCV13})\) together with its 95% confidence interval.
Log-binomial models can fail to converge or to compute standard errors. We will use the logbin R package which overcomes most of these issues. However, occasionally the log-binomial model will still fail to fit. In this case, the analysis will revert to a logistic regression model and adjusted odds ratios will be reported rather than adjusted relative risks.
Results from the log-binomial regression model will be displayed as a set of 2x2 tables (one for each inoculation dose and one for the data from all doses combined), stacked on top of each other. While we report the p-values from exact Fisher tests for each 2x2 table, the primary analysis and conclusion are based on the p-value from the log-binomial regression as detailed above.
simDat<-simDat %>%
mutate(doseGroup=factor(paste(sep=" ","CFU",dose),levels=c("CFU 160000","CFU 80000","CFU 20000")))
mod<-logbin::logbin(carriage~VaccName+doseGroup,data=simDat)
simDat20<-simDat %>% dplyr::filter(dose==20000)
simDat80<-simDat %>% dplyr::filter(dose==80000)
simDat160<-simDat %>% dplyr::filter(dose==160000)
tab20<-table(simDat20$VaccName,simDat20$carriage)
tab80<-table(simDat80$VaccName,simDat80$carriage)
tab160<-table(simDat160$VaccName,simDat160$carriage)
tabAll<-table(simDat$VaccName,simDat$carriage)
tabFull<-rbind(tab20,tab80,tab160,tabAll)
pFish20<-fisher.test(tab20)$p.value
pFish20<-ifelse(pFish20>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFish20)))," < 0.001")
pFish80<-fisher.test(tab80)$p.value
pFish80<-ifelse(pFish80>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFish80)))," < 0.001")
pFish160<-fisher.test(tab160)$p.value
pFish160<-ifelse(pFish160>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFish160)))," < 0.001")
pFishAll<-fisher.test(tabAll)$p.value
pFishAll<-ifelse(pFishAll>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFishAll)))," < 0.001")
pGlm<-summary(mod)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
rrGlmVacc<-exp(c(summary(mod)$coefficients["VaccNamePCV-13","Estimate"],confint(mod)["VaccNamePCV-13",]))
tabFull %>%
kable(caption=paste(sep="","(DUMMY TABLE) Number of participants colonised with experimental S. Pneumoniae serotype 6B by study arm and inoculation dose. According to a log-binomial regression model which also adjusts for inoculation dose, there is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),")."),col.names=c("Not colonised","Colonised"),row.names=T) %>%
kable_styling(full_width = F) %>%
pack_rows(paste(sep="","CFU 20,000 (Fisher test p",pFish20,")"),1,2) %>%
pack_rows(paste(sep="","CFU 80,000 (Fisher test p",pFish80,")"),3,4) %>%
pack_rows(paste(sep="","CFU 160,000 (Fisher test p",pFish160,")"),5,6) %>%
pack_rows(paste(sep="","All doses (Fisher test p",pFishAll,")"),7,8)
Table 4.2: (DUMMY TABLE) Number of participants colonised with experimental S. Pneumoniae serotype 6B by study arm and inoculation dose. According to a log-binomial regression model which also adjusts for inoculation dose, there is a statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR 0.52, 95% CI (0.37, 0.74), p < 0.001).
|
|
Not colonised
|
Colonised
|
|
CFU 20,000 (Fisher test p = 0.605)
|
|
Saline
|
17
|
3
|
|
PCV-13
|
19
|
1
|
|
CFU 80,000 (Fisher test p = 0.232)
|
|
Saline
|
24
|
16
|
|
PCV-13
|
30
|
10
|
|
CFU 160,000 (Fisher test p < 0.001)
|
|
Saline
|
27
|
40
|
|
PCV-13
|
47
|
20
|
|
All doses (Fisher test p < 0.001)
|
|
Saline
|
68
|
59
|
|
PCV-13
|
96
|
31
|
In addition we will summarise the model fit in a table reporting the estimated coefficients, their standard errors and associated p-values.
modRes<-summary(mod)$coefficients %>%
as.data.frame() %>%
mutate(
parameter=case_when(
rownames(summary(mod)$coefficients)=="(Intercept)"~"(Intercept)",
rownames(summary(mod)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
rownames(summary(mod)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
rownames(summary(mod)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000"
)
)
colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")
modRes %>%
dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","(DUMMY TABLE) Summary of the log-binomial regression model fit. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),")."),digits=3) %>%
kable_styling(full_width = FALSE)
Table 4.3: (DUMMY TABLE) Summary of the log-binomial regression model fit. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 59%, 95% CI (48.7%, 71.6%). There is a statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR 0.52, 95% CI (0.37, 0.74), p < 0.001).
|
|
Estimate
|
Std. error
|
Z statistic
|
p-value
|
|
(Intercept)
|
-0.527
|
0.098
|
-5.363
|
0.000
|
|
PCV-13 vaccine
|
-0.646
|
0.177
|
-3.657
|
0.000
|
|
Dose: CFU 80,000
|
-0.337
|
0.181
|
-1.865
|
0.062
|
|
Dose: CFU 20,000
|
-1.498
|
0.478
|
-3.131
|
0.002
|
We will summarise the carriage data in a graph as well.
plotDat<-simDat %>%
group_by(vaccInoDose,VaccName,dose) %>%
summarise(
n=n(),
k=sum(carriage),
.groups="drop") %>%
mutate(
prop=k/n,
propLow=NA,
propUpp=NA
)
plotDatTmp<-simDat %>%
group_by(VaccName) %>%
summarise(
n=n(),
k=sum(carriage),
.groups="drop") %>%
mutate(
prop=k/n,
propLow=NA,
propUpp=NA
) %>%
mutate(
dose="(any dose)",
vaccInoDose=paste(sep=" ",VaccName,"(any dose)")
)
plotDat<-rbind(plotDat,plotDatTmp) %>%
dplyr::select(vaccInoDose,VaccName,dose,n,k,prop,propLow,propUpp)
for(j in 1:nrow(plotDat)){
ci<-binom.test(x=plotDat$k[j],n=plotDat$n[j])$conf.int
plotDat$propLow[j]<-ci[1]
plotDat$propUpp[j]<-ci[2]
}
rm(ci,plotDatTmp)
plotDat$dose<-fct_recode(factor(plotDat$dose,levels=c("20000","80000","160000","(any dose)")),"20,000 CFU"="20000","80,000 CFU"="80000","160,000 CFU"="160000","(any dose)"="(any dose)")
plotDat<-plotDat[order(plotDat$VaccName,plotDat$dose),]
plotDat<-plotDat %>%
mutate(col=c(blueCols,orangeCols))
plotDat$vaccInoDose<-factor(plotDat$vaccInoDose,levels=c("Saline 20,000","Saline 80,000","Saline 160,000","Saline (any dose)","PCV-13 20,000","PCV-13 80,000","PCV-13 160,000","PCV-13 (any dose)"))
yMax<-max(c(plotDat$propUpp))
g1<-plotDat %>%
filter(VaccName=="Saline") %>%
ggplot(mapping=aes(x=dose,y=prop,ymin=propLow,ymax=propUpp,fill=dose,label=paste(sep=" / ",k,n))) +
geom_bar(stat="identity",alpha=0.9) +
geom_errorbar(width=0.2,col="grey") +
geom_text(y=-0.02,col="grey50",fontface = "bold") +
scale_fill_manual(values=c(rgb(70+40,130+40,180+40,maxColorValue=255),rgb(70,130,180,maxColorValue=255),rgb(70-40,130-40,180-40,maxColorValue=255),rgb(70-70,130-100,180-100,maxColorValue=255))) +
xlab("Study arm & inoculation dose") +
ylab("Carriage proportion") +
guides(fill="none") +
theme_light() +
labs(title="Saline") +
ylim(c(0,yMax)) +
geom_vline(xintercept=3.5,col="darkgrey",lwd=1.25,lty=2) +
theme(axis.text.x = element_text(angle = 45, hjust=1))
g2<-plotDat %>%
filter(VaccName=="PCV-13") %>%
ggplot(mapping=aes(x=dose,y=prop,ymin=propLow,ymax=propUpp,fill=dose,label=paste(sep=" / ",k,n))) +
geom_bar(stat="identity",alpha=0.9) +
geom_errorbar(width=0.2,col="grey") +
geom_text(y=-0.02,col="grey50",fontface = "bold") +
scale_fill_manual(values=c(rgb(245+10,165+40,0,maxColorValue=255),rgb(245,165,0,maxColorValue=255),rgb(245-10,165-40,0,maxColorValue=255),rgb(245-30,165-120,0,maxColorValue=255))) +
xlab("Study arm & inoculation dose") +
ylab("Carriage proportion") +
guides(fill="none") +
theme_light() +
labs(title="PCV-13") +
ylim(c(0,yMax)) +
geom_vline(xintercept=3.5,col="darkgrey",lwd=1.5,lty=2) +
theme(axis.text.x = element_text(angle = 45, hjust=1))
figCap<-paste(sep="","Log-binomial regression p-value for PCV-13 vaccination coefficient, p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001.\n Carriers out of total participants are indicated below each bar."))
grid.arrange(g1,g2,nrow=1,bottom=grid::textGrob(figCap, just="left"))
Sensitivity analysis: confounding by natural carriage
Natural carriage may act as a confounder on experimental carriage. To investigate this we will first tabulate the proportions of experimental carriage among i) individuals who were natural carriers at visit C and those who were not carriers at that visit, ii) individuals who were natural carriers at any of visits C, E, F, G and those who did not become natural carriers at all. We will stratify these tabulations by randomisation arm.
datTabNatCarr<-data.frame(
vaccName=c(rep(levels(simDat$VaccName)[1],2),rep(levels(simDat$VaccName)[2],2)),
natCarr=rep(c("No natural carriage","Natural carriage"),2),
k=NA,
n=NA,
perc=NA
)
datTmp<-table(simDat$carriageNot6B,simDat$carriage,simDat$VaccName)
for(j in 1:nrow(datTabNatCarr)){
datTabNatCarr$k[j]<-datTmp[ifelse(datTabNatCarr$natCarr[j]=="No natural carriage","0","1"),"1",datTabNatCarr$vaccName[j]]
datTabNatCarr$n[j]<-datTmp[ifelse(datTabNatCarr$natCarr[j]=="No natural carriage","0","1"),"0",datTabNatCarr$vaccName[j]]+datTmp[ifelse(datTabNatCarr$natCarr[j]=="no","0","1"),"1",datTabNatCarr$vaccName[j]]
}
datTabNatCarr$perc<-paste(sep="",format(nsmall=2,round(digits=2,100*datTabNatCarr$k/datTabNatCarr$n)),"%")
datTabNatCarr %>%
dplyr::select(!vaccName) %>%
knitr::kable(col.names=c("","k","n","Percentage of 6B carriers"),caption="(DUMMY TABLE) Tabulated counts and percentages of experimental carriers stratified by natural carriage and randomisation arm. Natural carriage is defined here as being a carrier of a non-6B strain at any of visits C, E, F, G.") %>%
kableExtra::kable_styling(full_width = FALSE) %>%
kableExtra::pack_rows(group_label="Saline",1,2) %>%
kableExtra::pack_rows(group_label="PCV-13",3,4)
Table 4.4: (DUMMY TABLE) Tabulated counts and percentages of experimental carriers stratified by natural carriage and randomisation arm. Natural carriage is defined here as being a carrier of a non-6B strain at any of visits C, E, F, G.
|
|
k
|
n
|
Percentage of 6B carriers
|
|
Saline
|
|
No natural carriage
|
48
|
65
|
73.85%
|
|
Natural carriage
|
11
|
25
|
44.00%
|
|
PCV-13
|
|
No natural carriage
|
30
|
88
|
34.09%
|
|
Natural carriage
|
1
|
10
|
10.00%
|
datTmp<-table(simDat$carriageNot6BVisitC,simDat$carriage,simDat$VaccName)
for(j in 1:nrow(datTabNatCarr)){
datTabNatCarr$k[j]<-datTmp[ifelse(datTabNatCarr$natCarr[j]=="No natural carriage","0","1"),"1",datTabNatCarr$vaccName[j]]
datTabNatCarr$n[j]<-datTmp[ifelse(datTabNatCarr$natCarr[j]=="No natural carriage","0","1"),"0",datTabNatCarr$vaccName[j]]+datTmp[ifelse(datTabNatCarr$natCarr[j]=="no","0","1"),"1",datTabNatCarr$vaccName[j]]
}
datTabNatCarr$perc<-paste(sep="",format(nsmall=2,round(digits=2,100*datTabNatCarr$k/datTabNatCarr$n)),"%")
datTabNatCarr %>%
dplyr::select(!vaccName) %>%
knitr::kable(col.names=c("","k","n","Percentage of 6B carriers"),caption="(DUMMY TABLE) Tabulated counts and percentages of experimental carriers stratified by natural carriage and randomisation arm. Natural carriage is defined here as being a carrier of a non-6B strain at visit C.") %>%
kableExtra::kable_styling(full_width = FALSE) %>%
kableExtra::pack_rows(group_label="Saline",1,2) %>%
kableExtra::pack_rows(group_label="PCV-13",3,4)
Table 4.5: (DUMMY TABLE) Tabulated counts and percentages of experimental carriers stratified by natural carriage and randomisation arm. Natural carriage is defined here as being a carrier of a non-6B strain at visit C.
|
|
k
|
n
|
Percentage of 6B carriers
|
|
Saline
|
|
No natural carriage
|
56
|
69
|
81.16%
|
|
Natural carriage
|
3
|
5
|
60.00%
|
|
PCV-13
|
|
No natural carriage
|
30
|
96
|
31.25%
|
|
Natural carriage
|
1
|
2
|
50.00%
|
We will not be powered to do a formal statistical comparison of experimental carriage proportion between these groups. However, if it appears that there may be substantially different experimental carriage proportions between groups, we will do a sensitivity analysis, fitting the same log-binomial model as for the primary analysis but including a term for natural carriage. Whether this natural carriage term is defined based on visit C carriage or carriage at any of visits C, E, F, G will be informed by which definition appears to have the largest difference in experimental carriage between groups.
\[
\log(\pi) = \log\left(E[Y|X_{20,000},X_{80,000},X_{PCV13},X_{nat\, carr}]\right) = \beta_0 + \beta_{20,000}\cdot X_{20,000} + \beta_{80,000}\cdot X_{80,000} + \beta_{PCV13}\cdot X_{PCV13} + \beta_{nat\,carr}\cdot X_{nat\,carr}
\]
The results from this model will be summarised just as the primary model results are summarised.
Secondary analyses
Descriptive analyses
We will calculate carriage proportions per study arm at each follow-up visits (visits E, F, G).
plotDatLine<-simDat %>%
dplyr::select(RID,dose,VaccName,carriageVisitE,carriageVisitF,carriageVisitG) %>%
pivot_longer(cols=c(carriageVisitE,carriageVisitF,carriageVisitG),names_to="visit",values_to="carriage") %>%
mutate(visit=gsub(visit,pattern="carriage",replacement="")) %>%
group_by(VaccName,visit,dose) %>%
summarise(
n=n(),
k=sum(carriage),
.groups="drop") %>%
mutate(
prop=k/n,
propLow=NA,
propUpp=NA
) %>%
mutate(
visitNum=case_when(
visit=="VisitE"~2,
visit=="VisitF"~7,
visit=="VisitG"~14,
TRUE~NA_real_
),
vaccInoDose=factor(paste(sep=" ",VaccName,format(dose,big.mark=",",trim=TRUE)),levels=c("Saline 20,000","Saline 80,000","Saline 160,000","PCV-13 20,000","PCV-13 80,000","PCV-13 160,000"))
)
for(j in 1:nrow(plotDatLine)){
ci<-binom.test(x=plotDatLine$k[j],n=plotDatLine$n[j])$conf.int
plotDatLine$propLow[j]<-ci[1]
plotDatLine$propUpp[j]<-ci[2]
rm(ci)
}
plotDatLineAll<-simDat %>%
dplyr::select(RID,dose,VaccName,carriageVisitE,carriageVisitF,carriageVisitG) %>%
pivot_longer(cols=c(carriageVisitE,carriageVisitF,carriageVisitG),names_to="visit",values_to="carriage") %>%
mutate(visit=gsub(visit,pattern="carriage",replacement="")) %>%
group_by(VaccName,visit) %>%
summarise(
n=n(),
k=sum(carriage),
.groups="drop") %>%
mutate(
prop=k/n,
propLow=NA,
propUpp=NA
) %>%
mutate(
visitNum=case_when(
visit=="VisitE"~2,
visit=="VisitF"~7,
visit=="VisitG"~14,
TRUE~NA_real_
),
vaccInoDose=fct_recode(VaccName,"Saline (any dose)"="Saline","PCV-13 (any dose)"="PCV-13")
) %>%
mutate(dose="All")
for(j in 1:nrow(plotDatLineAll)){
ci<-binom.test(x=plotDatLineAll$k[j],n=plotDatLineAll$n[j])$conf.int
plotDatLineAll$propLow[j]<-ci[1]
plotDatLineAll$propUpp[j]<-ci[2]
rm(ci)
}
plotDatLine<-rbind(plotDatLine,plotDatLineAll)
plotDatLine$dose<-fct_recode(factor(plotDatLine$dose,levels=c("20000","80000","160000","All")),"20,000 CFU"="20000","80,000 CFU"="80000","160,000 CFU"="160000","Any dose"="All")
plotDatLine$vaccInoDose<-factor(plotDatLine$vaccInoDose,levels=c("Saline 20,000","Saline 80,000","Saline 160,000","Saline (any dose)","PCV-13 20,000","PCV-13 80,000","PCV-13 160,000","PCV-13 (any dose)"))
yMax<-max(plotDatLine$propUpp)
plotDatLine %>%
ggplot(mapping=aes(x=visitNum,y=prop,ymin=propLow,ymax=propUpp,col=VaccName)) +
geom_line(lty=2,lwd=1) +
geom_point(size=3) +
geom_errorbar(width=0.3,lwd=0.65) +
scale_color_manual(values=c("steelblue","orange"),name="Study arm") +
#scale_color_manual(values=c(blueCols,orangeCols),name="Vaccination / inoculation arm") +
xlab("Study visit (days since inoculation)") +
ylab("Carriage proportion") +
xlim(c(0,15)) +
ylim(c(0,yMax)) +
guides(fill="none") +
theme_light() +
labs(caption="Confidence intervals are exact binomial confidence intervals.\n") +
facet_wrap("dose")
Carriage density
Unless stated otherwise, the density measurements used in these analyses are the culture-derived CFU measurements.
Conditional on a participant being colonised with pneumococcus at a given visit, we will also compare carriage density between vaccination groups, inoculation dose and serotype (6B induced carriage or other serotype natural carriage). Specifically, we will use generalised estimating equations (GEE) with an exchangeable correlation structure to fit a linear regression model, with Gaussian errors and log link function, to the data from visits C, E, F, G, restricted to data only from participants that were carriers at that visit, with carriage density as response and vaccination status, inoculation dose and type of carriage as predictors.
We will repeat this analysis, stratified by inoculation dose (note that any inoculation dose group with too few carriers for model fitting will skipped for this analysis).
simDatLongCarriersOnly <- simDat %>%
dplyr::select(RID,VaccName,dose,densityVisitC,densityVisitE,densityVisitF,densityVisitG,carriageSerotypeVisitC,carriageSerotypeVisitE,carriageSerotypeVisitF,carriageSerotypeVisitG) %>%
pivot_longer(cols=-c(RID,VaccName,dose),names_to=c(".value","Visit"),names_pattern="(.+)Visit(.+)") %>%
mutate(
serotype6B=ifelse(carriageSerotype=="6B",1,0)
) %>%
filter(!is.na(density))
geeModCarrDens<-geeglm(density~VaccName+serotype6B+factor(dose),id=factor(RID),corstr="exchangeable",data=simDatLongCarriersOnly,family=stats::gaussian("log"))
geeSum<-summary(geeModCarrDens)$coefficients
colnames(geeSum)[colnames(geeSum)=="Pr(>|W|)"]<-"p.value"
geeSum %>%
mutate(
parameter=case_when(
rownames(summary(geeModCarrDens)$coefficients)=="(Intercept)"~"(Intercept)",
rownames(summary(geeModCarrDens)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
rownames(summary(geeModCarrDens)$coefficients)=="serotype6B"~"Serotype 6B",
rownames(summary(geeModCarrDens)$coefficients)=="factor(dose)80000"~"Dose: CFU 80,000",
rownames(summary(geeModCarrDens)$coefficients)=="factor(dose)160000"~"Dose: CFU 160,000"
)
) %>%
as.data.frame() %>%
dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="(DUMMY TABLE) Results from the GEE model fit to carriage density data. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
kable_styling(full_width = FALSE)
Table 4.6: (DUMMY TABLE) Results from the GEE model fit to carriage density data. Model fitted only to data from participants who were carriers at the time of each given visit.
|
|
Estimate
|
Std. error
|
Wald statistic
|
p-value
|
|
(Intercept)
|
8.0395361
|
0.3416454
|
553.7463036
|
0.0000
|
|
PCV-13 vaccine
|
-2.1430151
|
0.2263122
|
89.6673821
|
0.0000
|
|
Serotype 6B
|
-0.1644315
|
0.2466826
|
0.4443172
|
0.5050
|
|
Dose: CFU 80,000
|
-0.1662984
|
0.3663960
|
0.2060036
|
0.6499
|
|
Dose: CFU 160,000
|
-0.4189573
|
0.3249730
|
1.6620537
|
0.1973
|
geeModCarrDens20<-geeglm(density~VaccName+serotype6B,id=factor(RID),corstr="exchangeable",data=simDatLongCarriersOnly %>% dplyr::filter(dose==20000),family=stats::gaussian("log"))
geeSum20<-summary(geeModCarrDens20)$coefficients
colnames(geeSum20)[colnames(geeSum20)=="Pr(>|W|)"]<-"p.value"
geeSum20 %>%
mutate(
parameter=case_when(
rownames(summary(geeModCarrDens20)$coefficients)=="(Intercept)"~"(Intercept)",
rownames(summary(geeModCarrDens20)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
rownames(summary(geeModCarrDens20)$coefficients)=="serotype6B"~"Serotype 6B"
)
) %>%
as.data.frame() %>%
dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="(DUMMY TABLE) Results from the GEE model fit to carriage density data for the CFU 20,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
kable_styling(full_width = FALSE)
Table 4.7: (DUMMY TABLE) Results from the GEE model fit to carriage density data for the CFU 20,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.
|
|
Estimate
|
Std. error
|
Wald statistic
|
p-value
|
|
(Intercept)
|
8.433599
|
0.2524273
|
1116.228391
|
0.0000
|
|
PCV-13 vaccine
|
-3.648147
|
0.3452342
|
111.664989
|
0.0000
|
|
Serotype 6B
|
-1.279717
|
0.4906794
|
6.801934
|
0.0091
|
geeModCarrDens80<-geeglm(density~VaccName+serotype6B,id=factor(RID),corstr="exchangeable",data=simDatLongCarriersOnly %>% dplyr::filter(dose==80000),family=stats::gaussian("log"))
geeSum80<-summary(geeModCarrDens80)$coefficients
colnames(geeSum80)[colnames(geeSum80)=="Pr(>|W|)"]<-"p.value"
geeSum80 %>%
mutate(
parameter=case_when(
rownames(summary(geeModCarrDens80)$coefficients)=="(Intercept)"~"(Intercept)",
rownames(summary(geeModCarrDens80)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
rownames(summary(geeModCarrDens80)$coefficients)=="serotype6B"~"Serotype 6B"
)
) %>%
as.data.frame() %>%
dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="(DUMMY TABLE) Results from the GEE model fit to carriage density data for the CFU 80,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
kable_styling(full_width = FALSE)
Table 4.8: (DUMMY TABLE) Results from the GEE model fit to carriage density data for the CFU 80,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.
|
|
Estimate
|
Std. error
|
Wald statistic
|
p-value
|
|
(Intercept)
|
7.799455
|
0.2657263
|
861.5084201
|
0.0000
|
|
PCV-13 vaccine
|
-2.083580
|
0.3549150
|
34.4644709
|
0.0000
|
|
Serotype 6B
|
-0.067325
|
0.3749711
|
0.0322372
|
0.8575
|
geeModCarrDens160<-geeglm(density~VaccName+serotype6B,id=factor(RID),corstr="exchangeable",data=simDatLongCarriersOnly %>% dplyr::filter(dose==160000),family=stats::gaussian("log"))
geeSum160<-summary(geeModCarrDens160)$coefficients
colnames(geeSum160)[colnames(geeSum160)=="Pr(>|W|)"]<-"p.value"
geeSum160 %>%
mutate(
parameter=case_when(
rownames(summary(geeModCarrDens160)$coefficients)=="(Intercept)"~"(Intercept)",
rownames(summary(geeModCarrDens160)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
rownames(summary(geeModCarrDens160)$coefficients)=="serotype6B"~"Serotype 6B"
)
) %>%
as.data.frame() %>%
dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="(DUMMY TABLE) Results from the GEE model fit to carriage density data for the CFU 160,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
kable_styling(full_width = FALSE)
Table 4.9: (DUMMY TABLE) Results from the GEE model fit to carriage density data for the CFU 160,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.
|
|
Estimate
|
Std. error
|
Wald statistic
|
p-value
|
|
(Intercept)
|
7.5572930
|
0.2403989
|
988.2513481
|
0.0000
|
|
PCV-13 vaccine
|
-2.0766338
|
0.2848111
|
53.1625940
|
0.0000
|
|
Serotype 6B
|
-0.0706854
|
0.3186049
|
0.0492215
|
0.8244
|
For graphical representation we will show average carriage densities at each visit, stratified by serotype (6B or other) and inoculation dose group.
simDatLongCarriersOnly %>%
group_by(Visit,VaccName,dose,serotype6B) %>%
summarise(
densityMedian=median(density),
densityQ25=quantile(density,probs=0.25),
densityQ75=quantile(density,probs=0.75),
.groups="drop"
) %>%
complete(Visit,VaccName,dose,serotype6B) %>%
mutate(
serotype6B=case_when(
serotype6B==0~"Not 6B",
serotype6B==1~"6B"
)
) %>%
ggplot(mapping=aes(x=Visit,y=densityMedian,ymin=densityQ25,ymax=densityQ75,fill=VaccName)) +
geom_bar(stat="identity",position=position_dodge()) +
geom_errorbar(width=0.5,position=position_dodge(width=0.9),col="darkgrey",lwd=0.5) +
theme_light() +
ylab("Carriage density") +
facet_wrap(~serotype6B+factor(paste(sep=" ","CFU",format(dose,big.mark=",",trim=TRUE)),levels=c("CFU 20,000","CFU 80,000","CFU 160,000"))) +
scale_fill_manual(values=c("steelblue","orange")) +
scale_y_continuous(trans="log10")
simDatLongCarriersOnly %>%
ggplot(mapping=aes(x=paste(sep=" ",Visit,VaccName),y=density)) +
geom_boxplot(mapping=aes(fill=VaccName),col="darkgrey",alpha=0.25) +
geom_jitter(mapping=aes(col=VaccName,group=VaccName),height=0,width=0.1,alpha=0.75) +
theme_light() +
ylab("Carriage density") +
xlab("Visit") +
facet_wrap(~serotype6B+factor(paste(sep=" ","CFU",format(dose,big.mark=",",trim=TRUE)),levels=c("CFU 20,000","CFU 80,000","CFU 160,000"))) +
scale_color_manual(values=c("steelblue","orange"),name="Study arm") +
scale_fill_manual(values=rep("white",2),name="Study arm") +
scale_y_continuous(trans="log10") +
scale_x_discrete(labels=c("Visit","C","Visit","E","Visit","F","Visit","G"))
Carriage duration
Given that there are only a small number of fixed visits where carriage is assessed, a formal analysis of duration data will not be performed (both start and end times of carriage events are all either left, interval or right censored). However we will provide a visual summary of the duration data, by carriage type (6B or other serotype), study arm and visit of first recorded carriage. Specifically we will show mean carriage durations as well as the standard error for this mean duration.
Given the censored nature of carriage start and end times, we need to make a few pragmatic decisions for the visual summaries. If carriage is only recorded at the last visit (visit G), then no duration can be inferred. Where carriage ceases between any 2 visits, the duration is computed as if it occurred at the mid-point of the interval given by the two visits. Where carriage is still ongoing at the last visit (visit G), carriage duration is recorded as if it lasted until visit G. The start time of carriage is taken as the visit at which carriage was first detected.
durationDat6B<-simDat %>%
filter(carriage==1 & visitFirstCarriage6B!="G") %>%
group_by(VaccName,visitFirstCarriage6B) %>%
summarise(
dur6BMean=mean(durationFirstCarriage6B,na.rm=T),
dur6BSE=sd(durationFirstCarriage6B,na.rm=T)/sum(!is.na(durationFirstCarriage6B)),
.groups="drop"
) %>%
tidyr::complete(VaccName,visitFirstCarriage6B)
durationDatNot6B<-simDat %>%
filter(carriageNot6B==1 & visitFirstCarriageNot6B!="G") %>%
group_by(VaccName,visitFirstCarriageNot6B) %>%
summarise(
durNot6BMean=mean(durationFirstCarriageNot6B,na.rm=T),
durNot6BSE=sd(durationFirstCarriageNot6B,na.rm=T)/sum(!is.na(durationFirstCarriageNot6B)),
.groups="drop"
) %>%
tidyr::complete(VaccName,visitFirstCarriageNot6B)
durationDat<-data.frame(
VaccName=c(durationDat6B$VaccName,durationDatNot6B$VaccName),
visitFirstCarriage=c(durationDat6B$visitFirstCarriage6B,durationDatNot6B$visitFirstCarriageNot6B),
durMean=c(durationDat6B$dur6BMean,durationDatNot6B$durNot6BMean),
durSE=c(durationDat6B$dur6BSE,durationDatNot6B$durNot6BSE),
type=c(rep("6B carriage",nrow(durationDat6B)),rep("Non-6B carriage",nrow(durationDatNot6B)))
)
durationDat %>%
ggplot(mapping=aes(x=factor(visitFirstCarriage,levels=sort(decreasing=T,unique(visitFirstCarriage))),y=durMean,ymin=durMean-durSE,ymax=durMean+durSE,fill=VaccName)) +
geom_bar(stat="identity",position=position_dodge()) +
geom_errorbar(width=0.5,position=position_dodge(width=0.9),col="darkgrey",lwd=0.5) +
scale_fill_manual(values=c("steelblue","orange")) +
coord_flip() +
xlab("Visit of first carriage") +
ylab("Mean carriage duration (days)") +
theme_light() +
labs(caption="Error bars indicate the standard error of the sample mean of carriage duration.\nAbsence of error bars means a single observation.\nAbsence of colour bar means no recorded first carriage at that visit for that vaccination group.") +
theme(axis.text.y=element_text(size=20)) +
facet_wrap("type",nrow = 2)
Total carriage density during study
We will define the total experimental carriage density during the study as the area under the time-density curve (tdAUC) for 6B colonisation. We will use the trapezoidal rule to calculate the tdAUC using the density measurements at visits E, F, G and assuming a visit C density of 0 CFU/ml. Likewise, density at any visit where no carriage was detected is taken to be 0 CFU/ml.
To calculate tdAUC, we use the actual density measurements, but for statistical tests and models we will use \(log_{10}(\mbox{tdAUC})\) as the tdAUC distribution is expected to be severely skewed (we restrict comparisons to carriers, so all tdAUCs used in analyses will be positive).
As per the main analysis of binary carriage status, we will compare \(log_{10}(\mbox{tdAUC})\) measurements for experimental carriers using a generalised linear model between PCV-13 and saline randomised participants while accounting for inoculation dose.
We will also compare the tdAUC for experimental carriers between PCV-13 vaccinated and saline randomised participants using the two-sample Mann-Whitney-Wilcoxon test, stratified by inoculation dose (for doses where there are no carriers in one of the 2 groups, this analysis will be skipped.).
We will summarise these results in 2 types of graph: 1) box and jitter plots showing the carriage densities in the different study arms, stratified by inoculation dose and 2) line graphs showing the individual and average colonisation density profiles over time in the two groups, again stratified by inoculation dose.
options(scipen=99)
simDat<-simDat %>%
dplyr::mutate(
density6BVisitE=case_when(carriageVisitE==1~densityVisitE,TRUE~0),
density6BVisitF=case_when(carriageVisitF==1~densityVisitF,TRUE~0),
density6BVisitG=case_when(carriageVisitG==1~densityVisitG,TRUE~0),
densityNot6BVisitC=case_when(carriageNot6BVisitC==1~densityVisitC,TRUE~0),
densityNot6BVisitE=case_when(carriageNot6BVisitE==1~densityVisitE,TRUE~0),
densityNot6BVisitF=case_when(carriageNot6BVisitF==1~densityVisitF,TRUE~0),
densityNot6BVisitG=case_when(carriageNot6BVisitG==1~densityVisitG,TRUE~0)
)
tdAUCVect<-function(dateVec,densVec){
n<-length(dateVec)
if(length(densVec)!=n){stop("dateVec and densVec need to be of the same length.")}
densVec[is.na(densVec)]<-0
#densVec<-log10(densVec+1)
midPoints<-(densVec[1:(n-1)]+densVec[2:n])/2
widths<-as.numeric(ymd(dateVec[2:n])-ymd(dateVec[1:(n-1)]))
auc<-sum(midPoints*widths)
return(auc)
}
tdAUCMat<-function(dateCols,densCols){
n<-ncol(dateCols)
if(ncol(densCols)!=n){stop("dateCols and densCols need to have the same number of columns.")}
densCols[is.na(densCols)]<-0
#densCols<-log10(densCols+1)
for(j in 1:n){dateCols[,j]<-ymd(dateCols[,j])}
midPoints<-(densCols[,1:(n-1)]+densCols[,2:n])/2
widths<-sapply(FUN=as.numeric,X=dateCols[,2:n]-dateCols[,1:(n-1)])
auc<-rowSums(midPoints*widths)
return(auc)
}
simDat<-simDat %>%
dplyr::mutate(tdAUC=tdAUCMat(dateCols=simDat %>% dplyr::select(visitC_date,visitE_date,visitF_date,visitG_date),densCols=cbind(0,simDat %>% dplyr::select(density6BVisitE,density6BVisitF,density6BVisitG))))
tdAUCmod<-glm(log10(tdAUC)~VaccName+doseGroup,data=simDat %>% filter(carriage==1))
tdAUCwilcox20<-try(silent=TRUE,wilcox.test(simDat$tdAUC[simDat$doseGroup=="CFU 20000" & simDat$carriage==1 & simDat$VaccName=="PCV-13"],simDat$tdAUC[simDat$doseGroup=="CFU 20000" & simDat$carriage==1 & simDat$VaccName=="Saline"]))
tdAUCwilcox80<-try(silent=TRUE,wilcox.test(simDat$tdAUC[simDat$doseGroup=="CFU 80000" & simDat$carriage==1 & simDat$VaccName=="PCV-13"],simDat$tdAUC[simDat$doseGroup=="CFU 80000" & simDat$carriage==1 & simDat$VaccName=="Saline"]))
tdAUCwilcox160<-try(silent=TRUE,wilcox.test(simDat$tdAUC[simDat$doseGroup=="CFU 160000" & simDat$carriage==1 & simDat$VaccName=="PCV-13"],simDat$tdAUC[simDat$doseGroup=="CFU 160000" & simDat$carriage==1 & simDat$VaccName=="Saline"]))
simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 20000","CFU 80000","CFU 160000"))
wilcoxResDf<-simDat %>%
dplyr::select(carriage,doseGroup,VaccName,tdAUC) %>%
dplyr::filter(carriage==1) %>%
dplyr::group_by(doseGroup,VaccName) %>%
dplyr::summarise(
median=format(nsmall=2,round(digits=2,(median(tdAUC)))),
IQR=paste(sep="","(",format(nsmall=2,round(digits=2,quantile(tdAUC,prob=0.25))),",",format(nsmall=2,round(digits=2,quantile(tdAUC,prob=0.75))),")"),
.groups="drop"
)
wilcoxResDf<-tidyr::complete(data=wilcoxResDf,doseGroup,VaccName,fill=list(median="-",IQR="-")) %>%
tidyr::pivot_wider(id_cols=doseGroup,names_from=VaccName,values_from=c(median,IQR)) %>%
dplyr::mutate(p.value=NA)
for(j in 1:nrow(wilcoxResDf)){
if(
wilcoxResDf$doseGroup[j]=="CFU 20000" & class(tdAUCwilcox20)!="try-error"){wilcoxResDf$p.value[j]<-format(nsmall=4,round(digits=4,tdAUCwilcox20$p.value))
}else if(
wilcoxResDf$doseGroup[j]=="CFU 80000" & class(tdAUCwilcox80)!="try-error"){wilcoxResDf$p.value[j]<-format(nsmall=4,round(digits=4,tdAUCwilcox80$p.value))
}else if(
wilcoxResDf$doseGroup[j]=="CFU 160000" & class(tdAUCwilcox160)!="try-error"){wilcoxResDf$p.value[j]<-format(nsmall=4,round(digits=4,tdAUCwilcox160$p.value))
}else{
wilcoxResDf$p.value[j]<-"-"
}
if(wilcoxResDf$p.value[j]=="0.0000"){wilcoxResDf$p.value[j]<-"<0.0001"}
}
wilcoxResDf<-wilcoxResDf[,c("doseGroup","median_PCV-13","IQR_PCV-13","median_Saline","IQR_Saline","p.value")]
wilcoxResDf %>%
knitr::kable(col.names=c("Dose","Median","IQR","Median","IQR","p.value"),caption="(DUMMY TABLE) Mann-Whitney-Wilcoxon test result for comparing tdAUC between study arms by inoculation group.") %>%
kableExtra::kable_styling(full_width=FALSE) %>%
kableExtra::add_header_above(c(" "=1,"PCV-13"=2,"Saline"=2," "=1))
Table 4.10: (DUMMY TABLE) Mann-Whitney-Wilcoxon test result for comparing tdAUC between study arms by inoculation group.
|
|
PCV-13
|
Saline
|
|
|
Dose
|
Median
|
IQR
|
Median
|
IQR
|
p.value
|
|
CFU 20000
|
960.46
|
(960.46,960.46)
|
25627.25
|
(17283.38,46463.54)
|
0.5000
|
|
CFU 80000
|
3092.21
|
(2511.96,4103.05)
|
19776.78
|
(11100.59,31174.54)
|
0.0007
|
|
CFU 160000
|
2284.43
|
(682.84,3445.72)
|
12207.65
|
(5792.58,28070.39)
|
<0.0001
|
simDat %>%
filter(carriage==1) %>%
ggplot(mapping=aes(x=VaccName,y=tdAUC,col=VaccName)) +
geom_boxplot(alpha=0.5) +
geom_jitter(height=0,width=0.25,alpha=0.5) +
theme_light() +
scale_color_manual(values=c("steelblue","orange")) +
guides(color="none") +
facet_wrap(~doseGroup,nrow=1) +
xlab("") +
ylab("total density AUC (CFU days/ml)") +
scale_y_continuous(trans = "log10") +
labs(title=paste(sep="","Conditional on carriage and on inoculation dose, the effect of PCV-13 vaccination on log110(tdAUC) is ",round(digits=2,summary(tdAUCmod)$coefficients["VaccNamePCV-13","Estimate"])," (p=",round(digits=4,summary(tdAUCmod)$coefficients["VaccNamePCV-13","Pr(>|t|)"]),")."))
simDatDensLong<-simDat %>%
dplyr::mutate(density6BVisitC=0) %>%
dplyr::select(c(PID,RID,VaccName,doseGroup,carriage,visitC_date,visitE_date,visitF_date,visitG_date,density6BVisitC,density6BVisitE,density6BVisitF,density6BVisitG)) %>%
dplyr::mutate(
timeSinceVisitC_VisitC=0,
timeSinceVisitC_VisitE=as.numeric(ymd(visitE_date)-ymd(visitC_date)),
timeSinceVisitC_VisitF=as.numeric(ymd(visitF_date)-ymd(visitC_date)),
timeSinceVisitC_VisitG=as.numeric(ymd(visitG_date)-ymd(visitC_date))
) %>%
dplyr::select(!starts_with("visit"))
colnames(simDatDensLong)<-gsub(pattern="density6B",replacement="density6B_",colnames(simDatDensLong))
simDatDensLong<-simDatDensLong %>%
tidyr::pivot_longer(cols=contains("Visit"),names_pattern="(.*)_Visit(.)",names_to=c("measurement","visit"),values_to="value") %>%
tidyr::pivot_wider(id_cols=c(PID,RID,VaccName,doseGroup,carriage,visit),names_from="measurement",values_from="value")
simDatDensLongAvg<-simDatDensLong %>%
filter(carriage==1) %>%
dplyr::group_by(VaccName,doseGroup,timeSinceVisitC) %>%
dplyr::summarise(
density6B=median(density6B)
)
simDatDensLong %>%
ggplot(mapping=aes(x=timeSinceVisitC,y=density6B,lty=PID,col=VaccName)) +
geom_line(alpha=0.65) +
theme_light() +
geom_line(data=simDatDensLongAvg,mapping=aes(x=timeSinceVisitC,y=density6B,col=VaccName),lwd=1.75,lty=1) +
xlab("Days since inoculation visit.") +
ylab("Carriage density (CFU/ml)") +
scale_y_continuous(trans = "log10") +
scale_color_manual(values=c("steelblue","orange")) +
scale_linetype_manual(values=sample(2:6,replace=TRUE,size=length(unique(simDatDensLong$PID)))) +
guides(linetype="none",color="none") +
facet_wrap(~VaccName+doseGroup,nrow=2) +
labs(title="Density time profiles by study arm and inoculation dose.",caption="(DUMMY TABLE) Dashed and dotted lines show individual profiles.\nThe thick solid lines show the median profiles among experimental carriers in each group.")
Effect of PCV-13 vaccine on natural, vaccine-type carriage
Given that the PCV-13 vaccine is meant to protect against 13 serotypes of S. pneumoniae, we will investigate whether we observe such a protective effect against vaccine-type serotypes.
However, as we are likely to observe much lower rates of natural carriage than experimental carriage we are almost surely not powered for this analysis and hence will primarily focus largely on a descriptive analysis, summarising vaccine-type carriage events in a 2x2 table and performing a simple Fisher exact test.
We will however attempt to fit the same model as for the primary analysis, just with natural, vaccine-type carriage as endpoint. If this model converges, we will report the results, but will be careful with over-interpreting the result given the expected low power for this analysis.
simDat<-simDat %>%
dplyr::mutate(
carriageNot6BVaccineType=case_when(
is.na(carriageSerotypeNot6B)~0,
!is.na(carriageSerotypeNot6B) & carriageSerotypeNot6B=="NVT"~0,
TRUE~1
))
tabVT<-table(simDat$VaccName,simDat$carriageNot6BVaccineType)
pFishVT<-fisher.test(tabVT)$p.value
pFishVT<-ifelse(pFishVT>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFishVT)))," < 0.001")
tabVT %>%
kable(caption=paste(sep="","(DUMMY TABLE) Number of participants colonised with vaccine-type S. Pneumoniae by study arm. Exact Fisher test p-value ",pFishVT,"."),col.names=c("Not colonised","Colonised"),row.names=T) %>%
kable_styling(full_width = F)
Table 4.11: (DUMMY TABLE) Number of participants colonised with vaccine-type S. Pneumoniae by study arm. Exact Fisher test p-value = 0.108.
|
|
Not colonised
|
Colonised
|
|
Saline
|
116
|
11
|
|
PCV-13
|
123
|
4
|
simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 160000","CFU 80000","CFU 20000"))
modNot6B<-logbin::logbin(carriageNot6B~VaccName+doseGroup,data=simDat)
pGlm<-summary(modNot6B)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
rrGlmVacc<-exp(c(summary(modNot6B)$coefficients["VaccNamePCV-13","Estimate"],confint(modNot6B)["VaccNamePCV-13",]))
modRes<-summary(modNot6B)$coefficients %>%
as.data.frame() %>%
mutate(
parameter=case_when(
rownames(summary(modNot6B)$coefficients)=="(Intercept)"~"(Intercept)",
rownames(summary(modNot6B)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
rownames(summary(modNot6B)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
rownames(summary(modNot6B)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000"
)
)
colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")
modRes %>%
dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","(DUMMY TABLE) Summary of the log-binomial regression model fit for natural carriage. The baseline natural carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of natural carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),")."),digits=3) %>%
kable_styling(full_width = FALSE)
Table 4.12: (DUMMY TABLE) Summary of the log-binomial regression model fit for natural carriage. The baseline natural carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 21%, 95% CI (13.9%, 33.0%). There is a statistically significant effect of PCV-13 vaccine on the probability of natural carriage (RR 0.40, 95% CI (0.20, 0.80), p = 0.009).
|
|
Estimate
|
Std. error
|
Z statistic
|
p-value
|
|
(Intercept)
|
-1.541
|
0.221
|
-6.971
|
0.000
|
|
PCV-13 vaccine
|
-0.918
|
0.352
|
-2.608
|
0.009
|
|
Dose: CFU 80,000
|
-0.291
|
0.371
|
-0.786
|
0.432
|
|
Dose: CFU 20,000
|
0.000
|
0.421
|
0.000
|
1.000
|
Sub-group analyses
We will repeat the above analyses stratified by sex.
Inoculum densities
We will compute the average before and after inoculation densities of the inocula that were prepared, summarise these by their medians, inter-quartile ranges and ranges per target dose and visualise them on boxplots.
inoDatLong<-inoDat %>%
tidyr::pivot_longer(cols=contains("doseConc"),names_to="type",values_to="doseConc") %>%
dplyr::mutate(type=gsub(pattern="doseConc",replacement="",type)) %>%
dplyr::mutate(
doseGroup=factor(levels=c("CFU 20,000","CFU 80,000","CFU 160,000"),case_when(
dose==20000~"CFU 20,000",
dose==80000~"CFU 80,000",
dose==160000~"CFU 160,000",
TRUE~NA_character_
))
) %>%
dplyr::mutate(
type=factor(levels=c("Before","After","Average"),case_when(
type=="Pre"~"Before",
type=="Post"~"After",
type=="Avg"~"Average"
))
)
inoDatSumPre<-inoDatLong %>%
dplyr::filter(type=="Before") %>%
dplyr::group_by(doseGroup) %>%
dplyr::summarise(
median=format(nsmall=0,big.mark=",",trim=TRUE,round(median(doseConc))),
iqr=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0.25,0.75))))),")"),
range=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0,1))))),")"),
)
inoDatSumPost<-inoDatLong %>%
dplyr::filter(type=="After") %>%
dplyr::group_by(doseGroup) %>%
dplyr::summarise(
median=format(nsmall=0,big.mark=",",trim=TRUE,round(median(doseConc))),
iqr=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0.25,0.75))))),")"),
range=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0,1))))),")"),
)
inoDatSumAvg<-inoDatLong %>%
dplyr::filter(type=="Average") %>%
dplyr::group_by(doseGroup) %>%
dplyr::summarise(
median=format(nsmall=0,big.mark=",",trim=TRUE,round(median(doseConc))),
iqr=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0.25,0.75))))),")"),
range=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0,1))))),")"),
)
rbind(inoDatSumPre,inoDatSumPost,inoDatSumAvg) %>%
kable(row.names=FALSE,col.names=c("","Median","IQR","Range"),caption="(DUMMY TABLE) Medians, inter-quartile ranges (IQRs) and ranges for the inoculum dose measurements before and after inoculation. Also shown are the summaries for the average measurements.") %>%
kable_styling(full_width = FALSE) %>%
pack_rows(group_label="Before inoculation",1,3) %>%
pack_rows(group_label="After inoculation",4,6) %>%
pack_rows(group_label="Average",7,9)
Table 4.13: (DUMMY TABLE) Medians, inter-quartile ranges (IQRs) and ranges for the inoculum dose measurements before and after inoculation. Also shown are the summaries for the average measurements.
|
|
Median
|
IQR
|
Range
|
|
Before inoculation
|
|
CFU 20,000
|
20,103
|
(19,855, 20,403)
|
(17,450, 21,074)
|
|
CFU 80,000
|
79,495
|
(76,326, 81,892)
|
(71,891, 88,038)
|
|
CFU 160,000
|
158,925
|
(154,043, 163,252)
|
(148,221, 177,970)
|
|
After inoculation
|
|
CFU 20,000
|
19,807
|
(17,886, 20,438)
|
(16,664, 22,644)
|
|
CFU 80,000
|
78,986
|
(75,331, 81,855)
|
(65,570, 92,501)
|
|
CFU 160,000
|
159,107
|
(150,590, 163,871)
|
(135,606, 184,496)
|
|
Average
|
|
CFU 20,000
|
19,831
|
(19,022, 20,521)
|
(17,057, 21,859)
|
|
CFU 80,000
|
79,068
|
(75,781, 81,435)
|
(69,470, 89,630)
|
|
CFU 160,000
|
158,664
|
(153,966, 162,928)
|
(144,248, 175,458)
|
inoDatLong %>%
ggplot(mapping=aes(x=type,y=doseConc,col=doseGroup)) +
geom_boxplot(fill="white",alpha=0.5) +
geom_jitter(height=0,alpha=0.5) +
geom_hline(mapping=aes(yintercept=ifelse(dose==20000,0.5*20000,NA)),col=blueCols[2],lty=2) +
geom_hline(mapping=aes(yintercept=ifelse(dose==20000,2*20000,NA)),col=blueCols[2],lty=2) +
geom_hline(mapping=aes(yintercept=ifelse(dose==80000,0.5*80000,NA)),col=blueCols[3],lty=2) +
geom_hline(mapping=aes(yintercept=ifelse(dose==80000,2*80000,NA)),col=blueCols[3],lty=2) +
geom_hline(mapping=aes(yintercept=ifelse(dose==160000,0.5*160000,NA)),col=blueCols[4],lty=2) +
geom_hline(mapping=aes(yintercept=ifelse(dose==160000,2*160000,NA)),col=blueCols[4],lty=2) +
theme_minimal() +
theme(legend.position = "none") +
scale_color_manual(values=blueCols[2:4]) +
xlab("Measurement type") +
ylab("Dose (CFU)") +
ylim(c(0,1.5*160000)) +
labs(caption="Dashed lines indicate tolerable ranges.") +
facet_wrap(~doseGroup)
Should we observe large amounts of variability in the dose measurements, we will repeat the above descriptive analyses stratified by study arm to check that no inadvertent imbalances in inoculation dose occurred across arms. Given that block randomisation is used together with the fact that participants inoculated on the same day are being inoculated with the same inoculum, this is extremely unlikely and hence such a stratified descriptive analysis not planned by default.
Comparison of culture and lytA/6B derived density measurements
We have stated that the culture derived density measurements are used as default in the above listed main analysis investigations. However, we also measure densities derived from lytA/6B multiplex qPCR. We will compare the two sets of density measurements.
Specifically, we will
- Produce scatter plots of culture (x-axis) and lytA/6B multiplex qPCR (y-axis) measurements, stratified by study arm, dose group and 6B and not-6B serotypes.
- Compute differences in log10(density+1) measurements between culture and lytA/6B multiples qPCR results and summarise these as histograms, stratified by 6B and not-6B serotypes.
simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 20000","CFU 80000","CFU 160000"))
dens6B<-simDat %>%
dplyr::select(c(PID,doseGroup,VaccName,density6BVisitE,density6BVisitF,density6BVisitG)) %>%
tidyr::pivot_longer(cols=c(density6BVisitE,density6BVisitF,density6BVisitG),names_to="visit",values_to="density6B") %>%
dplyr::mutate(visit=gsub(visit,pattern="density6B",replacement=""))
densNot6B<-simDat %>%
dplyr::select(c(PID,doseGroup,VaccName,densityNot6BVisitC,densityNot6BVisitE,densityNot6BVisitF,densityNot6BVisitG)) %>%
tidyr::pivot_longer(cols=c(densityNot6BVisitC,densityNot6BVisitE,densityNot6BVisitF,densityNot6BVisitG),names_to="visit",values_to="densityNot6B") %>%
dplyr::mutate(visit=gsub(visit,pattern="densityNot6B",replacement=""))
densDatLong<-densNot6B %>%
dplyr::mutate(density6B=dens6B$density6B[match(paste(sep="_",PID,visit),paste(sep="_",dens6B$PID,dens6B$visit))]) %>%
tidyr::pivot_longer(cols=c(densityNot6B,density6B),values_to="density",names_to="serotype") %>%
dplyr::mutate(serotype=gsub(serotype,pattern="density",replacement=""))
densDatLong<-densDatLong %>%
dplyr::mutate(density_lytA=rnorm(nrow(densDatLong),mean=density,sd=density/10)) %>%
dplyr::mutate(density_lytA=ifelse(density_lytA<0,0,density_lytA)) %>%
dplyr::mutate(density_log10Diff=log10(density+1)-log10(density_lytA+1))
g1<-densDatLong %>%
dplyr::filter(serotype=="6B") %>%
ggplot(mapping=aes(x=log10(density+1),y=log10(density_lytA+1),col=VaccName)) +
geom_point() +
scale_color_manual(values=c("steelblue","orange")) +
guides(color="none") +
theme_light() +
xlab("log10(culture density [CFU/ul] + 1)") +
ylab("log10(lytA/6B density [copies/ml] + 1)") +
facet_wrap(~doseGroup+VaccName,nrow=3) +
labs(title="6B")
g2<-densDatLong %>%
dplyr::filter(serotype=="Not6B") %>%
ggplot(mapping=aes(x=log10(density+1),y=log10(density_lytA+1),col=VaccName)) +
geom_point() +
scale_color_manual(values=c("steelblue","orange")) +
guides(color="none") +
theme_light() +
xlab("log10(culture density [CFU/ul] + 1)") +
ylab("log10(lytA/6B density [copies/ml] + 1)") +
facet_wrap(~VaccName,nrow=1) +
labs(title="Other serotypes")
grid.arrange(g1,g2,nrow=2,heights=c(3,1.3))
g1<-densDatLong %>%
dplyr::filter(serotype=="6B") %>%
dplyr::filter(density>0 | density_lytA>0) %>%
ggplot(mapping=aes(x=density_log10Diff,fill=VaccName)) +
geom_histogram(col="white",bins=20) +
scale_fill_manual(values=c("steelblue","orange")) +
guides(fill="none") +
theme_light() +
xlab("log10(culture density [CFU/ul] + 1) - log10(lytA/6B density [copies/ml] + 1)") +
ylab("") +
facet_wrap(~doseGroup+VaccName,nrow=3) +
labs(title="6B")
g2<-densDatLong %>%
dplyr::filter(serotype=="Not6B") %>%
dplyr::filter(density>0 | density_lytA>0) %>%
ggplot(mapping=aes(x=density_log10Diff,fill=VaccName)) +
geom_histogram(col="white",bins=20) +
scale_fill_manual(values=c("steelblue","orange")) +
guides(fill="none") +
theme_light() +
xlab("log10(culture density [CFU/ul]+ 1) - log10(lytA/6B density [copies/ml] + 1)") +
ylab("") +
facet_wrap(~VaccName,nrow=1) +
labs(title="Other serotypes")
grid.arrange(g1,g2,nrow=2,heights=c(3,1.3))
Finally, as a sensitivity analysis, we will repeat the above listed main analyses (investigations of differences in carriage density by inoculation group, study arms and serotype) using the lytA/6B density measurements.
Exploratory analyses
Dose-response curve and effective dose ED50
Using data from the saline arm only, we will use the data from the different inoculation dose groups to estimate a dose-response curve. Specifically, we will fit a logistic regression model with log dose as the predictor variable (equivalent to a 2-parameter log-logistic dose-response model Ritz et al. (2015)). For this we will use the actual dose of inoculum that was administered, defined as the average of the pre- and post-administration dose measurements, rather than the specified target dose.
Using the same notation as above, writing \(Y\) for the carriage status (i.e. the response) and \(X_{conc}\) for the actual inoculation dose measured in CFU, we fit the following model:
\[
\mbox{logit}\left(E[Y|X_{conc}]\right) = \beta_0 + \beta_1\cdot\log(X_{conc})
\]
where \(Y|X_{conc}\) follows a Bernoulli distribution with parameter \(p=E[Y|X_{conc}]\).
From this model we will derive an estimate of the effective dose \(ED50\), the dose at which on average 50% of participants become colonised with serotype 6B:
\[
ED50 = \exp\left(-\beta_0/\beta_1\right)
\]
To derive a 95% confidence intervals for \(ED50\), we will use bootstrap resampling of the data with B=10,000 samples. For each bootstrap sample, we will compute an estimate of \(ED50\), thus building up an empirical distribution of \(ED50\) values. We will use the percentile method to summarise that empiral distribution as a 95% confidence interval.
### fit DRC model
#modDrc<-drc::drm(carriage~log(doseConcAvg),data=simDat %>% dplyr::filter(VaccName=="Saline"),fct=LL.2(),type="binomial")
modDrcGlm<- glm(carriage~log(doseConcAvg),data=simDat %>% dplyr::filter(VaccName=="Saline"),family=binomial)
edFun<-function(modGlm,prob=0.5){
cfs<-coef(summary(modGlm))[,"Estimate"]
ed<-exp((log(prob/(1-prob))-cfs[1])/cfs[2])
return(ed)
}
ed50<-edFun(modDrcGlm)
B<-1e4
ed50Vect<-rep(NA,B)
for(b in 1:B){
simDatBS<-simDat[sample(1:nrow(simDat),size=nrow(simDat),replace=TRUE),]
modDrcGlmBS<- glm(carriage~log(doseConcAvg),data=simDatBS %>% dplyr::filter(VaccName=="Saline"),family=binomial)
ed50Vect[b]<-edFun(modDrcGlmBS)
}
ed50CI<-quantile(ed50Vect,probs=c(0.025,0.975))
ed50Mod<-format(nsmall=0,trim=TRUE,big.mark=",",round(digits=0,c(ed50,ed50CI)))
modDrcSum<-summary(modDrcGlm)$coefficients %>%
as.data.frame()
modDrcSum<-modDrcSum %>%
mutate(parameter=rownames(modDrcSum))
colnames(modDrcSum)<-c("Estimate","Std.Error","z.value","p.value","parameter")
modDrcSum %>%
mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
dplyr::select(!parameter) %>%
kable(escape=FALSE,col.names=c("Estimate","Std. error","z value","p value"),caption=paste(sep="","(DUMMY TABLE) Summary of the dose-response model fitted to data from participants randomised to saline. ED50, the minimum dose that will on average result in 50% carriage is estimated to be ",ed50Mod[1]," with 95% CI (",paste(collapse=", ",ed50Mod[2:3]),").")) %>%
kableExtra::kable_styling(full_width = FALSE)
Table 4.14: (DUMMY TABLE) Summary of the dose-response model fitted to data from participants randomised to saline. ED50, the minimum dose that will on average result in 50% carriage is estimated to be 111,093 with 95% CI (67,017, 228,945).
|
|
Estimate
|
Std. error
|
z value
|
p value
|
|
(Intercept)
|
-9.0894793
|
3.164699
|
-2.872146
|
0.0041
|
|
log(doseConcAvg)
|
0.7823536
|
0.274970
|
2.845232
|
0.0044
|
dfPred<-data.frame(
logDoseConcAvg=seq(log(1),log(200000),length=1000)
) %>%
mutate(
doseConcAvg=exp(logDoseConcAvg)
)
tmpPred<-as.data.frame(predict(modDrcGlm,newdata=dfPred,interval="link",se.fit=TRUE))
invLogit<-function(x){
res<-exp(x)/(1+exp(x))
return(res)
}
dfPred<-dfPred %>%
dplyr::mutate(
y=invLogit(tmpPred$fit),
yLow=invLogit(tmpPred$fit-qnorm(0.975)*tmpPred$se.fit),
yUpp=invLogit(tmpPred$fit+qnorm(0.975)*tmpPred$se.fit),
)
tmpDat<-simDat %>%
dplyr::filter(VaccName=="Saline") %>%
dplyr::group_by(dose) %>%
dplyr::summarise(
n=n(),
k=sum(carriage),
propCarriage=mean(carriage)
) %>%
mutate(
propCarriageLow=NA,
propCarriageUpp=NA
)
for(j in 1:nrow(tmpDat)){
tmp<-binom.test(x=tmpDat$k[j],n=tmpDat$n[j])$conf.int
tmpDat$propCarriageLow[j]<-tmp[1]
tmpDat$propCarriageUpp[j]<-tmp[2]
}
ggplot() +
geom_segment(mapping=aes(x=ed50,xend=ed50,y=0,yend=0.5),lty=2,lwd=0.5,col="steelblue") +
geom_segment(mapping=aes(x=0,xend=ed50,y=0.5,yend=0.5),lty=2,lwd=0.5,col="steelblue") +
geom_ribbon(mapping=aes(xmin=ed50CI[1],xmax=ed50CI[2],y=seq(0,1,length=100)),fill="steelblue",alpha=0.25) +
geom_line(data=dfPred,mapping=aes(x=doseConcAvg,y=y),col="orange",lwd=1) +
geom_ribbon(data=dfPred,mapping=aes(x=doseConcAvg,ymin=yLow,ymax=yUpp),fill="orange",alpha=0.25) +
geom_point(data=simDat,mapping=aes(x=doseConcAvg,y=carriage),col="darkgrey",alpha=0.5) +
geom_point(data=tmpDat,mapping=aes(x=dose,y=propCarriage),col="black",size=3) +
geom_errorbar(dat=tmpDat,mapping=aes(x=dose,ymin=propCarriageLow,ymax=propCarriageUpp),width=2000) +
xlab("Dose (CFU)") +
ylab("Probability of carriage") +
labs(title="Dose-response curve estimated from the saline randomised study participants.",caption=paste(sep="","A 2-parameter log-logistic model was used for the dose-response curve estimation.\nGrey dots show the individual carriage data as a function of actual dose delivered.\nBlack dots with error bars show the estimated proportion of carriage by target dose and the associated 95% confidence intervals.\nDashed blue lines and band indicate the ED50 threshold dose ",ed50Mod[1]," CFU with the 95% CI (",paste(collapse=", ",ed50Mod[2:3]),").")) +
theme_light()
Temporal dynamics of pneumococcal colonisation
The study will not run over long enough a period of time to properly investigate the presence or absence of seasonal effects. Using flexible regression models, we can however investigate whether there seems to be evidence for temporal variation of colonisation probabilities (for both experimental and natural carriage). Specifically we can investigate:
Whether there is temporal variation in the probability of natural colonisation, i.e. carriage of vaccine-type serotypes other than the experimental 6B strain.
Whether there is temporal variation in the probability of experimental colonisation by serotype 6B.
The first would most likely point to variation in pneumoccocal exposure over time, whereas the second could point to variations in susceptivility of colonisation over time.
For these investigations, we will refit the model from the main outcome analysis, but include restricted cubic spline terms for time of Visit C for each participant since Visit C of the first study participant.
To test the null hypothesis of no temporal variation, we will perform likelihood ratio tests comparing the 2 models to the respective models without the time variable.
As for the secondary analysis for natural, vaccine-type carriage, we need to point out that there may be far too few natural carriage events to fit the natural carriage model. Also the study was not powered to investigate temporal investigations, so we will report results mostly descriptively and for the purpose of hypothesis generation.
firstVisitC<-min(ymd(simDat$visitC_date))
simDat<-simDat %>%
dplyr::mutate(timeVisitC_sinceStudyStart=as.numeric(ymd(visitC_date)-firstVisitC))
tmp<-rcs(simDat$timeVisitC_sinceStudyStart) # need to manually add the spline terms as no direct support by logbin
simDat<-simDat %>%
dplyr::mutate(
timeVisitC_rcs1=as.numeric(tmp[,1]),
timeVisitC_rcs2=as.numeric(tmp[,2]),
timeVisitC_rcs3=as.numeric(tmp[,3]),
timeVisitC_rcs4=as.numeric(tmp[,4])
)
simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 160000","CFU 80000","CFU 20000"))
modTimeNot6B<-logbin::logbin(carriageNot6B~VaccName+doseGroup+timeVisitC_rcs1+timeVisitC_rcs2+timeVisitC_rcs3+timeVisitC_rcs4,data=simDat,start=c(coef(mod),rep(0,4)),method="ab") # if boundary issues with EM algorithm, we will use method="glm2" or method="ab"
pGlm<-summary(modTimeNot6B)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
pLrt<-lmtest::lrtest(modTimeNot6B,modNot6B)$`Pr(>Chisq)`[2]
rrGlmVacc<-exp(c(summary(modTimeNot6B)$coefficients["VaccNamePCV-13","Estimate"],confint(modTimeNot6B)["VaccNamePCV-13",]))
modRes<-summary(modTimeNot6B)$coefficients %>%
as.data.frame() %>%
mutate(
parameter=case_when(
rownames(summary(modTimeNot6B)$coefficients)=="(Intercept)"~"(Intercept)",
rownames(summary(modTimeNot6B)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
rownames(summary(modTimeNot6B)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
rownames(summary(modTimeNot6B)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000",
rownames(summary(modTimeNot6B)$coefficients)=="timeVisitC_rcs1"~"Time since study start",
rownames(summary(modTimeNot6B)$coefficients)=="timeVisitC_rcs2"~"Time since study start'",
rownames(summary(modTimeNot6B)$coefficients)=="timeVisitC_rcs3"~"Time since study start''",
rownames(summary(modTimeNot6B)$coefficients)=="timeVisitC_rcs4"~"Time since study start'''"
)
)
colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")
modRes %>%
dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","(DUMMY TABLE) Summary of the log-binomial regression model fit for natural, vaccine-type carriage with terms for time since study start. The baseline natural carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of natural carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),"). There is ",ifelse(pLrt<0.05,"a","no")," statistically significant effect of time since study start on the probability of natural carriage."),digits=3) %>%
kable_styling(full_width = FALSE)
Table 4.15: (DUMMY TABLE) Summary of the log-binomial regression model fit for natural, vaccine-type carriage with terms for time since study start. The baseline natural carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 9%, 95% CI ( 0.2%, 457.8%). There is a statistically significant effect of PCV-13 vaccine on the probability of natural carriage (RR 0.41, 95% CI (0.21, 0.81), p = 0.011). There is no statistically significant effect of time since study start on the probability of natural carriage.
|
|
Estimate
|
Std. error
|
Z statistic
|
p-value
|
|
(Intercept)
|
-2.400
|
2.001
|
-1.200
|
0.230
|
|
PCV-13 vaccine
|
-0.892
|
0.351
|
-2.545
|
0.011
|
|
Dose: CFU 80,000
|
-0.348
|
1.046
|
-0.332
|
0.740
|
|
Dose: CFU 20,000
|
0.692
|
1.648
|
0.420
|
0.674
|
|
Time since study start
|
0.005
|
0.013
|
0.349
|
0.727
|
|
Time since study start’
|
0.001
|
0.020
|
0.040
|
0.968
|
|
Time since study start’’
|
-0.049
|
0.122
|
-0.402
|
0.687
|
|
Time since study start’’’
|
1.461
|
1.869
|
0.782
|
0.434
|
simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 160000","CFU 80000","CFU 20000"))
modTime6B<-logbin::logbin(carriage~VaccName+doseGroup+timeVisitC_rcs1+timeVisitC_rcs2+timeVisitC_rcs3+timeVisitC_rcs4,data=simDat,start=c(coef(mod),rep(0,4)),method="ab")
pGlm<-summary(modTime6B)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
pLrt<-lmtest::lrtest(modTime6B,mod)$`Pr(>Chisq)`[2]
rrGlmVacc<-exp(c(summary(modTime6B)$coefficients["VaccNamePCV-13","Estimate"],confint(modTime6B)["VaccNamePCV-13",]))
modRes<-summary(modTime6B)$coefficients %>%
as.data.frame() %>%
mutate(
parameter=case_when(
rownames(summary(modTime6B)$coefficients)=="(Intercept)"~"(Intercept)",
rownames(summary(modTime6B)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
rownames(summary(modTime6B)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
rownames(summary(modTime6B)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000",
rownames(summary(modTime6B)$coefficients)=="timeVisitC_rcs1"~"Time since study start",
rownames(summary(modTime6B)$coefficients)=="timeVisitC_rcs2"~"Time since study start'",
rownames(summary(modTime6B)$coefficients)=="timeVisitC_rcs3"~"Time since study start''",
rownames(summary(modTime6B)$coefficients)=="timeVisitC_rcs4"~"Time since study start'''"
)
)
colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")
modRes %>%
dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","(DUMMY TABLE) Summary of the log-binomial regression model fit for experimental serotype 6B carriage with terms for time since study start. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),"). There is ",ifelse(pLrt<0.05,"a","no")," statistically significant effect of time since study start on the probability of experimental serotype 6B carriage."),digits=3) %>%
kable_styling(full_width = FALSE)
Table 4.16: (DUMMY TABLE) Summary of the log-binomial regression model fit for experimental serotype 6B carriage with terms for time since study start. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 30%, 95% CI ( 2.2%, 395.6%). There is a statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR 0.53, 95% CI (0.37, 0.74), p < 0.001). There is no statistically significant effect of time since study start on the probability of experimental serotype 6B carriage.
|
|
Estimate
|
Std. error
|
Z statistic
|
p-value
|
|
(Intercept)
|
-1.213
|
1.320
|
-0.918
|
0.358
|
|
PCV-13 vaccine
|
-0.642
|
0.174
|
-3.690
|
0.000
|
|
Dose: CFU 80,000
|
-0.117
|
0.536
|
-0.218
|
0.828
|
|
Dose: CFU 20,000
|
-1.087
|
1.059
|
-1.026
|
0.305
|
|
Time since study start
|
0.007
|
0.010
|
0.633
|
0.527
|
|
Time since study start’
|
-0.016
|
0.015
|
-1.081
|
0.280
|
|
Time since study start’’
|
0.119
|
0.074
|
1.607
|
0.108
|
|
Time since study start’’’
|
-1.675
|
0.908
|
-1.845
|
0.065
|
Time between vaccination (visit B) and inoculation (visit D)
We will provide histograms and tables of medians and inter-quartile ranges (overall, by study arm, by inoculation dose and by experimental carriage status) of the time between visits B and D to describe the variation and to explore whether there are differences between study arms and between carriers and non-carriers.
Should there be substantial differences in time between vaccination and inoculation visits between carriers and non-carriers, we will include this variable in the main objective model for phase 1 as a sensitivity analysis.
simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 20000","CFU 80000","CFU 160000"))
simDat <- simDat %>%
dplyr::mutate(
timeBtoD=as.numeric(ymd(visitC_date)-ymd(visitB_date)+7) # visit C is 1 week = 7 days before visit D
)
timeBtoD<-simDat %>%
dplyr::summarise(
median=median(timeBtoD),
IQR=paste(sep="","(",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.25))),",",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.75))),")")
)
timeBtoD<-cbind("All",timeBtoD)
colnames(timeBtoD)<-c("group","median","IQR")
timeBtoD_vacc<-simDat %>%
dplyr::group_by(VaccName) %>%
dplyr::summarise(
median=median(timeBtoD),
IQR=paste(sep="","(",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.25))),",",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.75))),")"),
.groups="drop"
)
colnames(timeBtoD_vacc)<-c("group","median","IQR")
pVacc<-wilcox.test(timeBtoD~VaccName,data=simDat)$p.value
timeBtoD_dose<-simDat %>%
dplyr::group_by(doseGroup) %>%
dplyr::summarise(
median=median(timeBtoD),
IQR=paste(sep="","(",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.25))),",",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.75))),")"),
.groups="drop"
)
colnames(timeBtoD_dose)<-c("group","median","IQR")
pDose<-kruskal.test(timeBtoD~doseGroup,data=simDat)$p.value
timeBtoD_carr<-simDat %>%
dplyr::group_by(carriage) %>%
dplyr::summarise(
median=median(timeBtoD),
IQR=paste(sep="","(",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.25))),",",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.75))),")"),
.groups="drop"
)
colnames(timeBtoD_carr)<-c("group","median","IQR")
pCarr<-wilcox.test(timeBtoD~carriage,data=simDat)$p.value
timeBtoD_carr[,1]<-case_when(timeBtoD_carr[,1]==0~"No",timeBtoD_carr[,1]==1~"Yes")
timeBtoD_fullTab<-rbind(timeBtoD,timeBtoD_vacc,timeBtoD_dose,timeBtoD_carr)
timeBtoD_fullTab %>%
knitr::kable(col.names=c(" ","Median","IQR"),digits=2,caption="(DUMMY TABLE) Median and inter-quartile range for the time between vaccination and inoculation visits overall, by study arm, inoculation dose and carriage status.") %>%
kableExtra::kable_styling(full_width=FALSE) %>%
kableExtra::pack_rows(group_label="Overall",start=1,end=1) %>%
kableExtra::pack_rows(group_label=paste(sep="","Study Arm (Mann-Whitney-Wilcoxon p = ",format(nsmall=4,round(digits=4,pVacc)),")."),start=2,end=3) %>%
kableExtra::pack_rows(group_label=paste(sep="","Inoculation dose (Kruskal-Wallis p = ",format(nsmall=4,round(digits=4,pDose)),")."),start=4,end=6) %>%
kableExtra::pack_rows(group_label=paste(sep="","Experimental 6B carriage (Mann-Whitney-Wilcoxon p = ",format(nsmall=4,round(digits=4,pCarr)),")."),start=7,end=8)
Table 4.17: (DUMMY TABLE) Median and inter-quartile range for the time between vaccination and inoculation visits overall, by study arm, inoculation dose and carriage status.
|
|
Median
|
IQR
|
|
Overall
|
|
All
|
15
|
(13.00,16.00)
|
|
Study Arm (Mann-Whitney-Wilcoxon p = 0.6188).
|
|
Saline
|
15
|
(13.00,16.00)
|
|
PCV-13
|
15
|
(13.50,16.00)
|
|
Inoculation dose (Kruskal-Wallis p = 0.7946).
|
|
CFU 20000
|
15
|
(13.75,16.00)
|
|
CFU 80000
|
15
|
(13.00,16.00)
|
|
CFU 160000
|
15
|
(14.00,16.00)
|
|
Experimental 6B carriage (Mann-Whitney-Wilcoxon p = 0.5836).
|
|
No
|
15
|
(13.00,16.00)
|
|
Yes
|
15
|
(13.25,16.00)
|
g1<-simDat %>%
ggplot(mapping=aes(x=timeBtoD)) +
geom_histogram(bins=length(unique(simDat$timeBtoD)),col="white") +
theme_light() +
xlab("") +
ylab("Frequency") +
labs(title="All participants")
g2<-simDat %>%
ggplot(mapping=aes(x=timeBtoD,fill=VaccName)) +
geom_histogram(bins=length(unique(simDat$timeBtoD)),col="white") +
theme_light() +
scale_fill_manual(values=c("steelblue","orange")) +
xlab("") +
ylab("Frequency") +
labs(title="By study arm.") +
guides(fill="none") +
facet_wrap(~VaccName)
g3<-simDat %>%
ggplot(mapping=aes(x=timeBtoD,fill=doseGroup)) +
geom_histogram(bins=length(unique(simDat$timeBtoD)),col="white") +
theme_light() +
scale_fill_manual(values=c("grey80","grey50","grey20")) +
xlab("") +
ylab("Frequency") +
labs(title="By inoculation dose.") +
guides(fill="none") +
facet_wrap(~doseGroup)
g4<-simDat %>%
ggplot(mapping=aes(x=timeBtoD,fill=ifelse(carriage==1,"Experimental carriage.","No experimental carriage"))) +
geom_histogram(bins=length(unique(simDat$timeBtoD)),col="white") +
theme_light() +
scale_fill_manual(values=c("grey40","grey20")) +
xlab("Time between visits B and D (days)") +
ylab("Frequency") +
labs(title="By experimental 6B carriage.") +
guides(fill="none") +
facet_wrap(~ifelse(carriage==1,"Experimental carriage.","No experimental carriage"))
grid.arrange(g1,g2,g3,g4,nrow=4)